Eye Gaze for Attention Prediction in Multimodal Human-Machine Conversation
Abstract
In a conversational system, determining a user’s focus of attention is crucial to the success of the system. Motivated by previous psycholinguistic findings, we are examining how eye gaze contributes to the automated identification of user attention during human-machine conversation. As part of this effort, we investigate the contributing roles of various features extracted from eye gaze and the visual interface. More precisely, we conduct a data-driven evaluation of these features and propose a novel evaluation metric for performing such an investigation. The empirical results indicate that gaze fixation intensity plays an integral role in attention prediction. Fixations to objects are fairly evenly distributed between the start of a reference and 1500 milliseconds prior. When combined with some visual features (e.g., the amount of visual occlusion of an object), fixation intensity can become even more reliable in predicting user attention. This paper describes this empirical investigation of features and discusses the further implications of attention prediction based on eye gaze for language understanding in multimodal conversational interfaces.
Similar Resources
An Exploration of Eye Gaze in Spoken Language Processing for Multimodal Conversational Interfaces
Motivated by psycholinguistic findings, we are currently investigating the role of eye gaze in spoken language understanding for multimodal conversational systems. Our assumption is that, during human-machine conversation, a user’s eye gaze on the graphical display indicates salient entities on which the user’s attention is focused. The specific domain information about the salient entities is ...
Incorporating Temporal and Semantic Information with Eye Gaze for Automatic Word Acquisition in Multimodal Conversational Systems
One major bottleneck in conversational systems is their inability to interpret unexpected user language inputs such as out-of-vocabulary words. To overcome this problem, conversational systems must be able to learn new words automatically during human-machine conversation. Motivated by psycholinguistic findings on eye gaze and human language processing, we are developing techniques to inco...
Automated Vocabulary Acquisition and Interpretation in Multimodal Conversational Systems
Motivated by psycholinguistic findings that eye gaze is tightly linked to human language production, we developed an unsupervised approach based on translation models to automatically learn the mappings between words and objects on a graphic display during human-machine conversation. The experimental results indicate that user eye gaze can provide useful information to establish such mappings, ...
WB4-3 Fritz - a Humanoid Communication Robot
In this paper, we present the humanoid communication robot Fritz. Our robot communicates with people in an intuitive, multimodal way. Fritz uses speech, facial expressions, eye gaze, and gestures to interact with people. Depending on the audio-visual input, our robot shifts its attention between different persons in order to involve them in the conversation. He performs human-like arm gesture...
The Role of Interactivity in Human-Machine Conversation for Automatic Word Acquisition
Motivated by the psycholinguistic finding that human eye gaze is tightly linked to speech production, previous work has applied naturally occurring eye gaze for automatic vocabulary acquisition. However, unlike in the typical settings for psycholinguistic studies, eye gaze can serve different functions in human-machine conversation. Some gaze streams do not link to the content of the spoken utt...